11 research outputs found

    A genetic algorithm approach for predicting ribonucleic acid sequencing data classification using KNN and decision tree

    Malaria parasites exhibit dramatic life-cycle variation as they spread across the numerous environments of the mosquito vector, and transcriptomes arise from thousands of diverse parasites. Ribonucleic acid sequencing (RNA-seq) is a prevalent gene expression profiling technique that has led to an enhanced understanding of genetic questions. RNA-seq measures gene expression at the transcript level, and its data call for methodological enhancements to machine learning procedures. Researchers have proposed several methods for evaluating and learning from biological data. In this study, a genetic algorithm (GA) is used as the feature selection process to fetch relevant information from the RNA-Seq mosquito Anopheles gambiae malaria vector dataset, and the results are evaluated using the k-nearest neighbor (KNN) and decision tree classification algorithms. The experiments obtained classification accuracies of 88.3% and 98.3%, respectively.
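    As a rough illustration of the pipeline described above, the sketch below runs a small genetic algorithm for feature selection and scores candidate feature masks with KNN and a decision tree via cross-validation. The synthetic data, GA operators, and settings are assumptions for illustration, not the paper's configuration.

```python
# Minimal GA feature selection sketch, scored with KNN and a decision tree.
# The dataset and all GA settings below are illustrative placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=50, n_informative=8,
                           random_state=0)

def fitness(mask, clf):
    # Fitness = mean 3-fold CV accuracy on the selected feature subset.
    if not mask.any():
        return 0.0
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def ga_select(clf, pop_size=16, n_gen=10, p_mut=0.05):
    pop = rng.random((pop_size, X.shape[1])) < 0.2           # random masks
    for _ in range(n_gen):
        scores = np.array([fitness(ind, clf) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]   # keep best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, X.shape[1])                # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(X.shape[1]) < p_mut          # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind, clf) for ind in pop])
    return pop[scores.argmax()]

for clf in (KNeighborsClassifier(n_neighbors=5),
            DecisionTreeClassifier(random_state=0)):
    mask = ga_select(clf)
    print(type(clf).__name__, int(mask.sum()), "features,",
          round(fitness(mask, clf), 3), "CV accuracy")
```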

    An ICA-ensemble learning approaches for prediction of RNA-seq malaria vector gene expression data classification

    Malaria parasites undergo striking life-stage variations as they develop across the multiple environments of the mosquito vector, and there are transcriptomes of several thousand different parasites. Ribonucleic acid sequencing (RNA-seq) is a prevalent gene expression tool leading to a better understanding of genetic interrogations. RNA-seq measures the transcription of gene expression, and its data necessitate procedural enhancements in machine learning techniques. Researchers have suggested various learning approaches for the study of biological data. This study applies the independent component analysis (ICA) feature extraction algorithm to uncover latent components in a high-dimensional RNA-seq vector dataset and estimates its classification performance; an ensemble classification algorithm is used to carry out the experiment. The approach is tested on the RNA-Seq mosquito Anopheles gambiae dataset, and the experiment obtained a classification accuracy of 93.3%.
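    A minimal sketch of the ICA-plus-ensemble idea using scikit-learn: FastICA extracts latent components and a soft-voting ensemble classifies them. The synthetic data, the component count, and the choice of base learners are assumptions, as the abstract does not specify them.

```python
# FastICA feature extraction followed by a soft-voting ensemble classifier.
from sklearn.datasets import make_classification
from sklearn.decomposition import FastICA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=300, n_features=100, n_informative=10,
                           random_state=0)

ensemble = VotingClassifier([
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
], voting="soft")

# ICA reduces the 100 input features to 15 latent components (assumed count).
pipe = make_pipeline(FastICA(n_components=15, random_state=0), ensemble)
print("CV accuracy:", round(cross_val_score(pipe, X, y, cv=5).mean(), 3))
```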

    A Prediction Model for Bank Loans Using Agglomerative Hierarchical Clustering with Classification Approach

    Businesses depend on banks for financing and other services, and the success or failure of a company depends in large part on the industry's ability to identify credit risk. As a result, banks must analyze whether or not a loan applicant will default in the future. In the past, financial firms relied on highly skilled personnel to evaluate whether a loan applicant was eligible. Machine learning algorithms and neural networks have since been used to train classifiers that forecast an individual's credit score from their prior credit history, preventing loans from being granted to individuals who have defaulted on their obligations; however, these machine learning approaches require modification to address difficulties such as class imbalance, noise, and time complexity. Customers leaving a bank for a competitor is known as churn, and predicting in advance which customers will leave gives a firm an edge in client retention and growth. Banks may use machine learning to predict the behavior of trusted customers by assessing past data, and may introduce several unique offers to retain those clients' trust. This study employed agglomerative hierarchical clustering, decision tree, and random forest classification techniques. The decision tree obtained an accuracy of 84% and the random forest an accuracy of 85%, while the data clustered through agglomerative hierarchical clustering obtained accuracies of 98.3% with the random forest classifier and 98.1% with the decision tree classifier.
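    One plausible reading of the cluster-then-classify setup is sketched below: the agglomerative cluster assignment is appended as a derived feature before training the two classifiers. The loan data, the cluster count, and the exact way the clustering feeds the classifiers are assumptions for illustration.

```python
# Agglomerative clustering labels appended as a feature, then RF and DT.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=12, random_state=0)

# Append the hierarchical-cluster assignment as a derived feature column.
labels = AgglomerativeClustering(n_clusters=4).fit_predict(X)
X_aug = np.column_stack([X, labels])

X_tr, X_te, y_tr, y_te = train_test_split(X_aug, y, test_size=0.3,
                                          random_state=0)
for clf in (RandomForestClassifier(random_state=0),
            DecisionTreeClassifier(random_state=0)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "accuracy:", round(clf.score(X_te, y_te), 3))
```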

    Customer Churn Prediction in Telecommunication Industry Using Classification and Regression Trees and Artificial Neural Network Algorithms

    Customer churn is a serious problem and a critical issue encountered by large businesses and organizations. Because of its direct impact on revenues, particularly in sectors such as telecommunications and banking, companies are working to develop ways to identify prospective customers who are likely to churn. It is therefore vital to investigate the issues that influence customer churn in order to devise appropriate measures to diminish it. The major objective of this work is to develop a churn prediction model that helps telecom operators identify the clients most likely to churn. The experimental approach applies machine learning procedures to a telecom churn dataset, using an improved Relief-F feature selection algorithm to pick related features from the large dataset. To quantify the model's performance, the classification results of CART and ANN are compared; the accuracy shows that the ANN has a higher predictive capacity of 93.88% compared to 91.60% for the CART classifier.
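    The sketch below illustrates the general Relief-F-then-classify flow with a deliberately simplified Relief weighting (a single nearest hit and miss per instance), followed by CART and a small neural network; the synthetic churn data and all settings are placeholders, not the paper's improved Relief-F.

```python
# Simplified Relief-style feature weighting, then CART and an MLP classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=30, n_informative=6,
                           random_state=0)
Xn = (X - X.min(0)) / (X.max(0) - X.min(0))      # scale to [0, 1] for Relief

def relief_weights(Xn, y):
    w = np.zeros(Xn.shape[1])
    for i in range(len(Xn)):
        d = np.abs(Xn - Xn[i]).sum(1)
        d[i] = np.inf                             # exclude the instance itself
        hit = np.where(y == y[i], d, np.inf).argmin()   # nearest same-class
        miss = np.where(y != y[i], d, np.inf).argmin()  # nearest other-class
        w += np.abs(Xn[i] - Xn[miss]) - np.abs(Xn[i] - Xn[hit])
    return w / len(Xn)

top = relief_weights(Xn, y).argsort()[-10:]       # keep the 10 best features
for clf in (DecisionTreeClassifier(random_state=0),          # CART
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                          random_state=0)):                   # ANN
    acc = cross_val_score(clf, X[:, top], y, cv=5).mean()
    print(type(clf).__name__, "accuracy:", round(acc, 3))
```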

    Strengthening Bioinformatics and Genomics Analysis Skills in Africa for Attainment of the Sustainable Development Goals Report of the 2nd Conference of the Nigerian Bioinformatics and Genomics Network

    The second conference of the Nigerian Bioinformatics and Genomics Network (NBGN21) was held from October 11 to October 13, 2021. The event was organized by the Nigerian Bioinformatics and Genomics Network. A 1-day genomic analysis workshop on genome-wide association study and polygenic risk score analysis was organized as part of the conference, primarily as a research capacity building initiative to empower Nigerian researchers to take a leading role in this cutting-edge field of genomic data science. The theme of the conference was “Leveraging Bioinformatics and Genomics for the attainments of the Sustainable Development Goals.” The conference used a hybrid approach, virtual and in-person, and served as a platform to bring together 235 registered participants, mainly from Nigeria and, virtually, from all over the world. NBGN21 had four keynote speakers, and four leading Nigerian scientists received awards for their contributions to genomics and bioinformatics development in Nigeria. A total of 100 travel fellowships were awarded to delegates within Nigeria. A major topic of discussion was the application of bioinformatics and genomics to the achievement of the Sustainable Development Goals (SDG 3—Good Health and Well-Being, SDG 4—Quality Education, and SDG 15—Life on Land [Biodiversity]). In closing, most of the NBGN21 conference participants were interviewed, and they agreed that bioinformatics and genomic analysis of African genomes are vital for identifying population-specific genetic variants that confer susceptibility to different diseases endemic in Africa. This knowledge can empower African healthcare systems and governments for timely intervention, thereby enhancing good health and well-being.

    Enhanced dimensionality reduction methods for classifying malaria vector dataset using decision tree

    RNA-Seq data are utilized in biological applications and in decision making for the classification of genes. Much recent work has focused on reducing the dimensionality of RNA-Seq data, and dimensionality reduction approaches have been proposed for fetching relevant information from a given dataset. In this study, a novel optimized dimensionality reduction algorithm is proposed by combining an optimized genetic algorithm with Principal Component Analysis and Independent Component Analysis (GAO-PCA and GAO-ICA), which are used to identify an optimal feature subset and latent correlated features, respectively. A decision tree classifier is applied to the reduced mosquito Anopheles gambiae dataset to enhance accuracy and scalability in gene expression analysis. The proposed algorithm fetches relevant features from the high-dimensional input feature space, drawing on feature ranking and prior experience. The performance of the model is evaluated and validated using classification accuracy against existing approaches in the literature. The experimental results are promising for feature selection and classification in gene expression data analysis, and they indicate that the approach is a capable addition to prevailing data mining techniques.
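    As a hedged stand-in for GAO-PCA/GAO-ICA, the sketch below replaces the genetic optimization with a plain search over the component count and classifies the reduced data with a decision tree; the dataset and the search range are illustrative only.

```python
# PCA and ICA reduction with a searched component count, then a decision tree.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=80, n_informative=10,
                           random_state=0)

for name, reducer in (("PCA", PCA), ("ICA", FastICA)):
    # Plain search over component counts stands in for the GA optimization.
    best = max(
        (cross_val_score(
            make_pipeline(reducer(n_components=k, random_state=0),
                          DecisionTreeClassifier(random_state=0)),
            X, y, cv=5).mean(), k)
        for k in (5, 10, 20, 40))
    print(f"{name}: best CV accuracy {best[0]:.3f} with {best[1]} components")
```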

    An Effective Intrusion Detection in Mobile Ad-hoc Network Using Deep Belief Networks and Long Short-Term Memory

    A Mobile Ad-hoc Network (MANET) is a self-organizing collection of mobile devices communicating in a distributed fashion across numerous hops. MANETs are an appealing technology for many applications, including rescue operations, environmental monitoring, and tactical operations, because they let people communicate without the use of permanent infrastructure. This flexibility, however, creates additional security vulnerabilities. Because of their benefits and expanding demand, MANETs have attracted a lot of interest from the scientific community; they do, however, appear more vulnerable than other networks to numerous attacks that wreak havoc on their performance. Traditional cryptography techniques cannot entirely defend MANETs against fresh attacks and vulnerabilities due to their distributed architecture; these issues can, however, be mitigated by machine learning-based intrusion detection systems (IDSs). IDSs, which typically screen system processes and identify intrusions, are commonly employed to supplement existing security methods, because preventative techniques are never enough. Because MANET topologies continually evolve, their nodes are highly resource-constrained, and they lack central observation stations, intrusion detection is a complex and difficult process, and conventional IDSs are hard to apply to them: existing methodologies must be adapted for MANETs or new approaches created. This paper presents a novel approach founded on deep belief networks (DBN) and long short-term memory (LSTM) for MANET attack detection. The experimental analysis covered probe, root-to-local, user-to-root, and denial-of-service (DoS) attacks. In the first phase, particle swarm optimization was used for feature selection; subsequently, the DBN and LSTM were used to classify attacks in the MANET. The experimental results gave an accuracy of 99.46%, a sensitivity of 99.52%, and a recall of 99.52% for the DBN, and an accuracy of 99.75%, a sensitivity of 99.79%, and a recall of 99.79% for the LSTM.
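    A compact sketch of the LSTM classification stage alone is shown below; the DBN branch and the particle swarm optimization feature-selection phase are omitted, and the shapes, labels, and hyperparameters are assumptions for illustration.

```python
# LSTM stage of a MANET attack classifier; random stand-in data throughout.
import numpy as np
import tensorflow as tf

n_samples, n_features, n_classes = 1000, 20, 5   # e.g. normal + 4 attack types
X = np.random.rand(n_samples, 1, n_features).astype("float32")
y = np.random.randint(n_classes, size=n_samples)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1, n_features)),       # one timestep per record
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
```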

    Phishing Detection in Blockchain Transaction Networks Using Ensemble Learning

    The recent progress in blockchain and wireless communication infrastructures has paved the way for creating blockchain-based systems that protect data integrity and enable secure information sharing. Despite these advancements, concerns regarding security and privacy continue to impede the widespread adoption of blockchain technology, especially when sharing sensitive data. Specific security attacks against blockchains, such as data poisoning attacks, privacy leaks, and single points of failure, must be addressed to develop efficient blockchain-supported IT infrastructures. This study proposes the use of deep learning methods, including Long Short-Term Memory (LSTM), Bi-directional LSTM (Bi-LSTM), and convolutional neural network LSTM (CNN-LSTM), to detect phishing attacks in a blockchain transaction network. These methods were evaluated on a dataset of malicious and benign addresses drawn from the Ethereum blockchain darklist and whitelist, and the results showed an accuracy of 99.72%.
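    The sketch below outlines one of the three architectures (a Bi-LSTM binary classifier); the sequence shape, the random stand-in data, and the hyperparameters are placeholders rather than the paper's setup.

```python
# Bi-LSTM binary classifier sketch for phishing vs. benign addresses.
import numpy as np
import tensorflow as tf

seq_len, n_feat = 10, 8                   # e.g. last 10 transactions per address
X = np.random.rand(500, seq_len, n_feat).astype("float32")
y = np.random.randint(2, size=500)        # 1 = phishing address, 0 = benign

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len, n_feat)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
```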

    A Linear Discriminant Analysis and Classification Model for Breast Cancer Diagnosis

    Breast cancer is the most common malignancy among women globally, yet most cases are identified at a late stage, and mammography for the analysis of breast cancer is not routinely available at all general hospitals. Prolonging the period between detection and treatment may raise the likelihood of the disease proliferating. To speed up the process of diagnosing breast cancer and lower the mortality rate, a computerized method based on machine learning was created. The purpose of this investigation was to enhance the diagnostic accuracy of machine learning algorithms for breast cancer, allowing tumors to be classified and predicted as either benign or malignant. This investigation applies the machine learning algorithms of random forest (RF) and support vector machine (SVM), with the feature extraction method of linear discriminant analysis (LDA), to the Wisconsin Breast Cancer Dataset. The SVM with LDA and the RF with LDA yielded accuracies of 96.4% and 95.6%, respectively. This research has useful applications in the medical field, as it enhances the efficiency and precision of a diagnostic system. Evidence from this study shows that better prediction is crucial and can benefit from machine learning methods, and the results validate the use of feature extraction for breast cancer prediction when compared to the existing literature.
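    A minimal sketch of the LDA-plus-classifier pipeline, using scikit-learn's bundled copy of the Wisconsin breast cancer data; hyperparameters are library defaults, not the paper's tuned values (note that LDA yields a single component for two classes).

```python
# LDA feature extraction feeding SVM and Random Forest classifiers.
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

for name, clf in (("SVM", SVC()),
                  ("RF", RandomForestClassifier(random_state=0))):
    pipe = make_pipeline(LinearDiscriminantAnalysis(), clf)  # LDA as extractor
    acc = cross_val_score(pipe, X, y, cv=10).mean()
    print(name, "CV accuracy:", round(acc, 3))
```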